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Abstract. These pages contain a short overview on the state of the art of effi- 
cient numerical analysis methods that solve systems of multivariate polynomial 
equations. We focus on the work of Steve Smale who initiated this research 
framework, and on the collaboration between Stephen Smale and Michael 
Shub, which set the foundations of this approach to polynomial system-solving, 
culminating in the more recent advances of Carlos Beltran, Luis Miguel Pardo, 
Peter Biirgisser and Felipe Cucker. 



1. The modern numerical approach to polynomial system solving 

In this paper we survey some of the recent advances in the solution of polynomial 
systems. Such a classical topic has been studied by hundreds of authors from many 
different perspectives. We do not intend to make a complete historical description 
of all the advances achieved during the last century or two, but rather to describe 
in some detail the state of the art of what we think is the most successful (both 
from practical and theoretical perspectives) approach. Homotopy methods are 
used to solve polynomial systems in real life applications all around the world. 

The key ingredient of homotopy methods is a one-line thought: given a goal 
system to be solved, choose some other system (similar in form, say with the same 
degree and number of variables) with a known solution (^o, and move this new 
system to the goal system, tracking how the known solution moves to a solution of 
the goal. Before stating any notation, we can explain briefly why this process is 
reasonable: if for every t G [0, 1] we have a system of equations ft (/o is the system 
with a known solution, /i is the one we want to solve), then we are looking for a path 
Ct, t G [0, 1], such that /t(Ct) = 0. As long as the derivative d/t(Ct) is invcrtible for 
all t we can continue the solution from /q to /i, by the implicit function theorem. 
Now we have various methods to accomplish this continuation. We can slowly 
increment t and use iterative numerical solution methods such as Newton's method 
to track the solution or we may differentiate the expresion /t(Ct) = and solve for 
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d/{dt)(Q) — Ct. Then, we can write our problem as an initial value problem: 

1 (^0 known 

Systems of ODEs have been much studied and hence this is an interesting idea: 
we have reduced our original problem to a very much studied one. One can just 
plug in a standard numerical ODE solver such as backward Euler or a version of 
Runge-Kutta's method. Even then, in practice, it is desirable to, from time to time, 
perform some steps of Newton's method z — > x — Dft{x)~^ ft{x) to our approxima- 
tion zt of Ct, to get closer to the path (/t,Ct)- After some testing and adjustment 
of parameters, this nai've idea can be made to work with impressive practical per- 
formance and there are several software packages which attain spectacular results 
(solving systems with many variables and high degree) in a surprisingly short run- 
ning time, see for example [4TJ [7J |62l |42] 

From a mathematical point of view, there are several things in the process we 
have just described that need to be analyzed: will there actually exist a path Q 
(maybe it is only defined for, say, t < 1/2)? what is the expected complexity of 
the process (in particular, can we expect average polynomial running time in some 
sense)? what "simple system with a known solution" should we start at? how 
should we join /o and /i, that is what should be the path /(? 

In the last few decades a lot of progress has been made in studying these ques- 
tions. This progress is the topic of this paper. 

2. A TECHNICAL DESCRIPTION OF THE PROBLEM 

We will center our attention in Smale's 17-th problem, which we recall now. 

Problem 1. Can a zero of n complex polynomial equations in n unknowns he found 
approximately, on the average, in polynomial time with a uniform algorithm? 

We have written in bold the technical terms that need to be clarified. 

In order to understand the details of the problem and the solution suggested 
in Section [TJ we need to describe some important concepts and notation in detail. 
Maybe the first one is our understanding of what a "solution" is: clearly, one 
cannot expect solutions of polynomial systems to be rational numbers, so one can 
only search for "quasi-solutions" in some sense. There are several definitions of 
such a thing, the most stable being the following one (introduced in [56], see also 

nsmg): 

Definition 1. Given a polynomial system, understood as a mapping f : C" — > C", 
an approximate zero of f with associated (exact) zero C, is a vector zo G C" such 
that 

ll^fc-CII < ^Iko-CII, fc>o, 

where Zk is the result of applying k times Newton's operator z ^ z — Df{z)^^f{z) 
(note that the definition of approximate zero implicitly assumes that Zk is defined 
for all k>0.) 

The power of this definition is that, as we will see below, given any polynomial 
system / and any exact zero C S C", approximate zeros of / with associated zero 
C exist whenever Df{C,) is an invertible matrix. 
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Recall that our first goal is to transform the problem of polynomial system solving 
into an implicit function problem or an ODE system like that of (jl.ip . There exist 
two principal reasons why the solution of such a system can fail to be defined for 
all t > 0: that the function defining the derivative is not everywhere defined (this 
corresponds naturally to Dft{Ct) not being invertible), and that the solution escapes 
to infinity. The first problem seems to be more delicate and difficult to solve, but 
the second one is actually very easily dealt with: we just need to define our ODE in 
a compact manifold, instead of just in C". The most similar compact manifold to 
C" is P(C"+^), and the way to take the problem into P(C"+^) is just homogenizing 
the equations. 

Definition 2. Let f : C" — > C" be a polynomial system, that is f = (/i, . . . ,/„) 
where fi : C" — > C is a polynomial of degree some di, 

f{xi,...,Xn)= '^ai\...,a„'''l^ ■ ■ ■ -^n"- 

ctiH haji^di 

The homogeneous counterpart of f is h : C"^^ — > C" defined by h = (/ii, . . . , ft,„) 
where 



h{xo,Xi,...,Xn) ^ ^ 

aiH \-an<di 



We will talk about such a system h simply as an homogeneous system. 

Note that if C is a zero of / then (1,C) is a zero of the homogeneous counter- 
part h oi f. Reciprocally, if C = {(o,(i, ■ ■ ■ ,Cn) is a zero of h and if Co 7^ 0, then 
(C1/C07 • ■ • : Cn/Co) is a zero of /. Thus, the zeros of / and h are in correspondence 
and we can think of solving h and then recovering the zeros of / (this is not a com- 
pletely obvious process when we only have approximate zeros, see [15].) Moreover, 
it is clear that for any complex number A S C and for x G C"^^ we have 

h{Xx) = Diag{X'^\...,X'^")h{x), 

and thus the zeros of h lie naturally in the projective space P(C"'''^). 

As we will be working with homogeneous systems and projective zeros, we need 
a definition of approximate zero in the spirit of Definition [T] which is amenable 
to a projective setting. The following one, which uses the projective version [49] 
of Newton's operator, makes the work. Here and throughout the paper, given a 
matrix or vector A, by A* we mean the complex conjugate transpose of A, and by 
dii{x, y) we mean the Riemannian distance from x to y, where x and y are elements 
in some Riemannian manifold. 

Definition 3. Given an homogeneous system h, an approximate zero of h with 
associated (exact) zero ^ G P(C"^"'") is a vector zq g P(C"^"'") such that 

dfiizk, C) < ^jTrr^^flX^o, C), fc > 0, 

where Zk is the result of applying k times the projective Newton operator z i— >■ 
z — Dh[z) \~}^ h{z) (again, the definition of approximate zero implicitly assumes 
that Zk is defined for all k > 0.) Here, by Df{z) we mean the restriction of the 
derivative of h at z, to the (complex) orthogonal subspace = {2/ € C""*"^ : y*x = 
0}. 



4 



CARLOS BELTRAN AND MICHAEL SHUB 



It is a simple exercise to verify that (projective) Newton's method is well defined, 
that is the point it defines in projective space does not depend on the representative 
z £ C"+^ chosen for a point in projective space. 

A (projective) approximate zero of h is thus a projective point such that the 
successive iterates of the projective Newton operator quickly approach an exact 
zero of h. Thus finding an approximate zero is an excellent output of a numerical 
zero-finding algorithm to solve h. 

Because we are going to consider paths of systems {/it}tg[a.ti]i it is convenient 
to fix a framework where one can define these nicely. To this end, we consider the 
vector space of homogeneous polynomials of fixed degree s > 1: 

Us ~ {h (z C[xq^ . . . ^ Xn] : h is homogeneous of degree s}. 

It is convenient to consider an Hermitian product (and the associated metric) on 
Us- A desirable property of such a metric is the unitary invariance, namely, we 
would like to have an Hermitian product such that 

{h,g)n^ = {hoU,goU)n^, yUeUn+i, 

where Un+i is the group of unitary matrices of size n + 1. Such property was studied 
in detail in (FT] . It turns out that there exists a unique (up to scalar multiplication) 
Hermitian product that satisfies it, the one defined as follows: 

aoH i-a„=s qqH ha,i=s 



/ . g\ "■au....a„"ai,...,a„, 

aoH han=s 

where " just means complex conjugation. Note that this is just a weighted version 
of the standard complex Hermitian product in complex affine space. 

Then, given a list of degrees (d) = (di, . . . , (i„), we consider the vector space 

n 

1=1 

Note that an element h of can be seen both as a mapping h : C"+^ — > C" or as 
a polynomial system, and can be identified by the list of coefficients oi hi, . . . , hn- 
We denote by P(H(rf)) the projective space associated to 'H(d), by the complex 
dimension of P('H(d)) (so the dimension of 'H(d) is A^ + 1) and we consider the 
following Hermitian structure in H^d)' 



This Hermitian product (and the associate Hermitian structure and metric) is also 
called the Bombieri-Weyl or the Kostlan product (structure, metric). As usual, 
this Hermitian product in 'H(^d) defines an associated Riemannian structure given 
by the real part of (•, •). We can thus consider integrals of functions defined on 

We denote by § the unit sphere in 'H(ti), and we endow S with the inherited 
Riemannian structure from that of 'H(^d)- Then, P('H(d)) has a natural Riemannian 
structure, the unique one making the projection § — )■ Fiji^d)) a Riemannian sub- 
mersion. That is the derivative of the projection restricted to the normal to the 
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fibers is an isometry. We can thus also consider integrals of functions defined in § or 
¥{'H(d))- We can now talk about probabilities in § or P{7i(^d))- given a measurable 
(nonnegative or integrable) mapping X defined in § or P('H((i)), we can consider its 
expected value: 

where we simply denote by i'{E) the volume of a Ricmannian manifold E. Simi- 
larly, one can talk about probabilities in 'H(ci) according to the standard Gaussian 
distribution compatible with (•, •): given a measurable (nonnegative or integrable) 
mapping X defined in its expected value is: 

E«(.,(^) = 7^4vTT / Xih)e-\\'^\\'/'dh. 

We can now come back to Problem [T] and see what do each of the terms in that 
problem mean: Smale himself points out that one can just solve homogeneous 
systems (as suggested above). Wc still have a few terms to clarify: 

• found approximately. This means finding an approximate zero in the sense 
of Definition [H 

• on the average, in polynomial time. This now means that, if X(h) is the 
time needed by the algorithm to output an approximate zero of the input 
system h, then the expected value of X is a quantity polynomial in the 
input size, that is polynomial in N . The number of variables, n, and the 
maximum of the degrees, d, are smaller than N , and hence one attempts 
to get a bound on the expected value of X, as a polynomial in n, d, N. 

• uniform algorithm. Smale demands an algorithm in the Blum-Shub-Smale 
model [211 HD] , that is exact operations and comparisons between real num- 
bers are assumed. This assumption departs from the actual performance 
of our computers, but it is close enough to be translated to performance in 
many situations. Uniform means that the same algorithm works for all (d) 
and n. 



3. Geometry and condition number 

We can now set up a geometric framework for homotopy methods. Consider the 
following set, usually called the solution variety: 

(3.1) v^{{hx)& nn^d)) X P(C"+i) : HO = 0}. 

This set is actually a smooth complex submanifold (as well as a complex algebraic 
subvariety) of F{H(d)) x P(C""''^), see [20], and is clearly compact. It will be useful 
to consider the following diagram. 

V 

(3.2) ""'^ ^ ^2 

PCH(d)) P(C"+i) 

It is clear that 7rj~^(ft,) is a copy of the zero set of h. Reciprocally, for fixed ^ G 
P(C"'"'""'^), the set 7^2^(0 is the vector space of polynomial systems that have C as a 
zero. 
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Let E' C V be the set of critical points of tti and S = t^iC^') ^ the set 

of critical values of tti. It is not hard to prove that: 

• TTi restricted to the set V \ n^^i^) is a (smooth) 2?-fold covering map, 
where I? = di • • • d„ is the Bezout number. 

• E' = {(/i, C) G V : Dh{() has non-maximal rank}. In that case, we say 
that C is a singular zero of h. Otherwise, we say that C is a regular zero 
of h. 

This means, in particular, that the homotopy process described above can be carried 
out whenever the path of systems lies outside of S: 

Theorem 4. Let {ht : t E [^jb]} be a curve in F{H(^d)) \ ^ o.'i^d, let C be a 
zero of ha- Then, there exists a unique lift of ht through tti, that is a curve 
{ht,Ct) G V such that = In particular, (^h is a zero of hi,. Moreover, the lifted 
curve satisfies: 

(3.3) = {ht,Dht{(:t) \-l kiCt)) ■ 

Finally, the set T, C P('H(rf)) is a complex proper algebraic variety, thus it has real 
codimension 2 and the projection of most real lines in T-L^d) io IP(^(d)) does not 
intersect E. 

In the case the thesis of Theorem S] holds we just say that can be continued 
to a zero of ha- One can be even more precise: 

Theorem 5. Let {ht : t G [a, b]} be a curve in F{7ii^d))\ S and let ( be a zero of 
ha- Then, every zero C of ha can be continued to a zero of hb, defining a bijection 
between the V zeros of ha and those ofh^. 

Remark 6. Even if ht cross E some solutions may be able to be continued while 
others may not. 

The (normalized) condition number |51| is a quantity describing "how close to 
singular" a zero is. Given h G T~L[d) and z S P(C""''^), let 

(3.4) A^(/,z) = 11/11 !|(m(z) l.)-^D^ag{\\zr^~H\")h, 

and ti(f,z) = +0O if Dh(z) j^x is not invertible. Sometimes /i is denoted /.tnorm 
or /Uproj but we prefer to keep the more simple notation here. One of the most 
important properties of fi is that it is an upper bound for the norm of the (locally 
defined) implicit function related to tti in p.2p . Namely, let (/i, C) G T(^hx)^ where 
{h, C) G V is such that fi{h, Q < +oo. Then, 

(3.5) licil <M^,C)I|A||, KhX)>V^- 

We also have the following result. 
Theorem 7 (Condition Number Theorem, [51]). 

sm (rf_R(/l, E;;)) 

where da is the Riemannian distance in P('H(d)) and 

E<; = {ft e P('H(d)) : ft(C) = 0, and Dh{C) \c_i- is not invertible}. 
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Note that this is a version of the classical Condition Number Theorem of linear 
algebra (see Theorem [27] below) . The existence of approximate zeros in the sense 
of Definition 13] above is also guaranteed by this condition number, as was noted in 
[STj . More precisely: 

Theorem 8 (//-Theorem, [STj). There exists a constant uq > (uq = 0.17586 
suffices) with the following property. Let {h,C,) £ V and let z e P(C"+"'^) satisfy 

Then, z is an approximate zero of h with associated zero C,. 

4. The complexity of following a homotopy path 

The sentence "can be continued" in the discussion of Section [3] can be made 
much more precise, by defining an actual path-following method. It turns out that 
the unique method that has actually been proved to correctly follow the homotopy 
paths and at the same time achieve some known complexity bound is the most 
simple one, which only uses the projective Newton operator, and not an ODE 
solver step. 

Problem 2. It would he an interesting project to compare the overall cost of using a 
higher order ODE solver to the projective Newton-based method we describe below. 
Higher order methods or even predictor-corrector methods may require fewer steps 
but be more expensive at each step so a total cost comparison is in order. Some 
experience indicates that higher order methods are rarely cheaper, if ever. See p51 

iMiiini- 

More precisely, the projective Newton-based homotopy method is as follows. 
Given a path {ht : a < t < b} C P('H(rf)), and given Za an approximate zero of 
ha with associated (exact) zero Ca, let to > be "small enough" and let 

that is Za+to is the result of one application of the projective Newton operator 
based on ha+to to the point Za- If Za is an approximate zero of ha and to is small 
enough, then Za can be close enough to the actual zero Ca+to of ^a+to to satisfy 
Theorem |S] and thus be an approximate zero of ha+to well. Then, by definition 
of approximate zero, Za+to be half-closer to Ca+to than Za. This leads to an 
inductive process (choosing ti, then t2, etc. until hb is reached) that, analysed 
in detail, can be made to work and actually programmed. The details on how to 
choose to would take us too far apart from the topic, so we just give an intuitive 
explanation: if we are to move from (ha,Ca) to {ha+toXa+to) must be sure 
that we are far enough from S' to have our algorithm behaving properly. As the 
condition number essentially measures the distance to E', it should be clear that 
the bigger the condition number, the smaller step to we can take. This idea lead to 
the following result (see [SS] for a weaker, earlier result): 

Theorem 9 ([5^). Let (htXt) C V \ S', i e [a,b] he a path. If the steps 
tQ,ti,... are correctly chosen, then an approximate zero of hi, is reached at some 
point, namely there is a k > 1 such that X^iLo ti — b ~ a (k is the number of steps 
in the inductive process above.) Moreover, one can bound 

k< [Cd^/^L,], 
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where d is the maximum of the degrees in {d), C is some universal constant, and 
(4.1) L.= f\{huCt)\\{htXt)\\dt 

is the condition lenght of the path (ht^C^t)- Moreover, the amount of arithmetic 
operations needed in each step is polynomial in the input size N , and hence the 
total complexity of the path-following procedure is a quantity polynomial in N and 
linear in 

There exist several ways to algorithmically produce the steps to,ti,. . . in this 
theorem (and indeed the process has been programmed in two versions |121 113 j .) 
but the details are too technical for this report, see [SI EH [31]. We also point out 
that, if the path we are following is linear, i.e. ht — {1 — t)ho + thi, and if the input 
coordinates are rational numbers, then all the operations can be carried out over 
the rationals without a dramatic increase of the bit size of intermediate results, see 

Ha- 

Note that since is a length it is independent of the parametrization of the 
path. If we specify a path of polynomial systems in ^{d) then we project the path 
of polynomials and solutions into V to calculate the length. We may project from 
H{d) to S first and reparametrizc if we wish. For example, we project the straight 
line segment ht = {\ — t)g +th ioi Q < t < 1 into S and reparametrizc by arc-length. 
If \\g\\ = \\h\\ = 1 the resulting curve is 

7 / \ h — (h, g)g . 
ht = gcos{t) + — • sm < 

\\h- {h,g)g\\ 

which is an arc of great circle through g and h. If < i < dj^{g, h). then the arc 
goes from g to h. Here df;{g, h) is the Riemannian distance in § between g and h 
which is the angle between them. 

5. The problem of good starting points 

We now come back to the original question in Smale's 17-th problem. Our plan 
is to analyse the complexity of an algorithm that we could call "linear homotopy" : 
choose some g e §, C G P(C"+^) such that g{Q — (we will call {gX) ^ "starting 
pair"). For input /i G §, consider the path contained in the great circle : 

(5.1) ht^gco^{t)+ sin(t), t e [Q,dR{g,h)]. 

\\h~ {h,g)g\\ 

Then, use the method described in Theorem [H] to track how Co moves to C,dn{g,h)^ a- 
zero of hdj^(gji) = h, thus producing an approximate zero of h. We call this linear 
homotopy (maybe a more appropriate name would be "great circle homotopy" ) 
because great circles are projections on § of segments in 'H(d)- 

Assuming that the input h is uniformly distributed on S, we can give an upper 
bound for the average number of arithmetic operations needed for this task (that 
is, the average complexity of the linear homotopy method) by a polynomial in N 
multiplied by the following quantity: 

— / KhuCtmht,Ct)\\dtd§, 

JhGS Jo 

where ht is defined by (|5.1|) and (t is defined by continuation (the fact that htCiT, = 
ptyset, and thus the existence of such (t, is granted by Theorem [4] for most choices 
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of g^h). It is convenient to replace this last expected value by a similar upper 
bound: 

^1(5,0 = ^ / r Kht,ctmkxt)\\dtds. 

^[^) Jhes Jo 

Note that we are just replacing the integral from to dn^g, h) by the distance from 

to TT. 

We thus have: 

Theorem 10. Let (g, C) S ^- The average complexity of linear homotopy with 
starting pair {g,C) hounded above by a polynomial in N multiplied by Ai{g,C,). 

This justifies the following definition: 

Definition 11. Fix some polynomial p G R[x,y, z]. We say that {g,C) 'is a good 
starting pair w.r.t. p{x, y, z) if Ai{g, C) < pin, d, N) (which implies that the average 
number of steps of the linear homotopy is 0{d^^^p{n,d, N)).) From now on, if 
nothing is said, we assume p{x , y , z) = \f2-nxz. Thus, (g, C) G V is a good initial 
pair ifAiigX) < V^ttuN . 

So, if a good initial pair is known for all choices of n and the list of degrees (d), 
then the total average complexity of linear homotopy is polynomial in N. In other 
words, finding good starting pairs for every choice of n and (d) gives a satisfactory 
solution to Problem ([T]). 

In [55| the following paii0 was conjectured to be a good starting pair (for some 
polynomial p{x, y)) : 

{dY^z^^^^zi, 
■■ , c = (i,o,...,o). 

To this date, proving this conjecture is still an open problem. Some experimental 
data supporting this conjecture was shown in [12] 

5.1. Choosing initial pairs at random: an Average Las Vegas algorithm 
for problem dT]). One can study the average value of the quantity Ai{gX) de- 
scribed above. Most of the results in this section are based in the fact that the 
expected value of the square of the condition number is relatively small. This was 
first noted in [52] , then this expected value was computed exactly in |16| : 

Theorem 12. Let h € S be chosen at random, and let ^ be chosen at random, with 
the uniform distribution, among the zeros ofh. Then, the expected value of ^"^{h^Q 
is at most nNV. More exactly: 

Eh&l J2 f^ih,C)A =VN (^n(^l + ^^ -2n-?j<nNV. 



^Because n,d < N, wc could just talk about a one variable polynomial p{x) and change 
p{n, d, N) to p{N) in the following definition. However, we prefer here to be a bit more precise. 

^Thc pair conjectured in |55| does not contain the extra factors. There is, however, some 
consensus that these extra factors should be added, for with these factors the condition number 
f-idt C) = n^^^ is minimal. 
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In particular, in the case of one homogeneous polynomial of degree d (i.e. n = 1,) 
we have: 

E;,es J2 M(/i,C)' ^did+1). 

\C:ft(C)=0 / 

Now we use some arguments which arc very much inspired by ideas from integral 
geometry, one of the main contributions of Lluis Santalo to XX century mathemat- 
ics. We can try to compute the expected value of Ai{g, ()■ Although this can be 
done directly (see |18|.) it is easier to first consider an upper bound of Ai: let us 
note from p.5|) that 

(5.3) A,igX)<^, f f\{htXt? dt dS. 

f^(s) Aes Jo 

So, we have 

V2 

kC:ff(C)=0 / \C:'i{C)=0 ' 



fi{htXt)^ dt dS. 




where Lg ^ is the half-great circle in S containing g, h, starting at g and going to 
—g (we have to remove from this argument the case h = —g but this is unimportant 
for integration purposes.) Note that we can define a measure and more generally 
a concept of integral in § as follows: given any measurable function g : § — >• [0, oo), 
its integral is 

(5.4) Eg_/jgsxs 

Now, this last formula describes an invariant (with respect to the group of sym- 
metries of S, that can be identified with the unitary group of size A'^ -I- 1 or with 
the orthogonal group of size 2N + 2) measure in S and is thus equal to a multiple 
of the usual measure in §. In words, averaging over § or over great circles in § is 
the same, up to a constant. The constant is easy to compute by considering the 
constant function q= 1. What we get is: 

E,es[ -4i(5,C) <^E^6s( Y f(^^0 

\C:9(C)=0 / ^ \CMC)=0 

After this argument is made rigorous, we have (see [Mj [15] for earlier versions of 
the following result:) 

Theorem 13 (|16]). Let g G § 6e chosen at random with the uniform distribution, 
and let C be chosen at random, with the uniform ( discrete ) distribution among the 
roots of g. Then, the expected value of Ai{g, C) is at most -^nN . In particular, for 
such a randomly chosen pair {g, C,), with probability at least 1/2 we have Ai{g, C) ^ 
\/2'KnN , that is, {g, C) is a good starting pai^. 



■^Note that we are computing there the average of Ai not that of the integral of fj,^ as in 1161 . 
From 115.311 . the constant \/2 has to be added to the formula in | 16| in this context. 
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The previous result would be useless for describing an algorithm (because choos- 
ing a random zero of a randomly chosen g € S might be a difficult problem) without 
the following one. 

Theorem 14 ([16]). The process of choosing a random g (z S and a random zero 
C of g can be emulated by a simple linear algebra procedure. 

The details of the linear algebra procedure of Theorem [14] require the introduc- 
tion of too much notation. We just describe the process in words: one has to choose 
a random n x (n -I- 1) matrix M with complex entries, compute its kernel (a pro- 
jective point C G P(C""''^)) and consider the system g GE> that has ^ as a zero and 
whose linear part is given by M. A random higher-degree term has to be added to 
g, and then linear and higher-degree terms must be correctly weighted. This whole 
process has running time polynomial in N. We thus have: 

Corollary 15. The linear homotopy algorithm with the starting pair obtained as 
in Theorem \14\ has average complexit'^ O(N^). 

The word "average" in Corollary [TS] must be understood as follows. For an input 
system h, let T(h) be the average running time of the linear homotopy algorithm, 
when {g, Q is randomly chosen following the procedure of Theorem 1141 Then, the 
expected value of T{h) for random h is 0{N'^). This kind of algorithm is called 
Average Las Vegas, the "Las Vegas" term coming from the fact that a random 
choice has to be done. The user of the algorithm plays the role of a Las Vegas 
casino, not of a Las Vegas gambler: the chances of winning (i.e. getting a fast 
answer to our problem) arc much higher than those of loosing (i.e. waiting for a 
long time before getting an answer.) 

Some of the higher moments of Ai{g, Q have also been proved to be small. For 
example, the second moment (thus, also the variance) of Ai{g, C) is polynomial in 
N , as the following result shows: 

Theorem 16 (TF). Let 2 <k <3. Then, the expectation of Ai{g,C)^ satisfies 

E(^i(g,C)'-) <oo. 

Moreover, let 2 < k < 3 — 2inV ^^^^j expectation E (^1(3, C)'^) satisfies, 

E {Aiig, 0'') < 22'^'+'^/2+4 g ^fc^3fc-4^2p4fc-8 

In particular, E{Ai{g,C)^) < 5l2eTr'^n'^N^\nV. 

We have been concentrating on finding one zero of a polynomial system. But 
we could find k zeros < k <'D hy choosing k different random initial pairs using 
Theorem 1 141 This process is known from |16j to ouput every zero of the goal system 
h with the same probability l/V, if ft, ^ E. Another option is to choose some initial 
system g which has k known zeros, and simultaneously continuing the k homotopy 
paths with the algorithm of Theorem [9] In the case of finding all zeros the sum of 
the number of steps to follow each path, is by Theorem [9] and (|3.5p bounded above 
by a constant times 

dR(g.h) 

J2 KhuCtfdt. 

Cf-htiCt)=0 




^We use here the 0(X) notation: this is the same as 0{X log{X)'') for some constant c, that 
is logarithmic factors are cleaned up to make formulas look prettier. 
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So for the great circle homotopies we have been discussing an analogous of Theorem 
[m holds: 

Theorem 17 ( 1161 ). Let g CzS be chosen at random with the uniform distribution. 
Then, the expected value of Jq"^^^''^^ J2ctht(Ct)=o l^(^t,Ct)'^ dt is at most ^nNT). In 
particular, for such a randomly chosen g, with probability at least 1/2 we have 
jdR{g,h) 'Y^^^^^^^^^^^ dt < TmNT), that is, the linear homotopy for finding 

all zeros starting at g takes at most a constant times d^^^uND steps to output all 
zeros of h, on the average. 

Note that in general, one cannot write down all the T> zeros of g to begin with, 
so Theorem [T7] does not immediately yield a practical algorithm. 

We point out that, even for the case n = 1, no explicit descriptions of pairs {g, C,) 
satisfying Ai{g,C) ^ d^^^^ are known. Of course, no explicit polynomial 5 € S is 
known in that case satisfying the claim of Theorem 1171 An attempt to determine 
such a polynomial lead to some progress in the understanding of elliptic Fekete 
points, see Section [5] 

5.2. The roots of unity combined with a method of Renegar: a quasi 
polynomial time deterministic algorithm for problem ([T|). One can also 
ask for an algorithm for Problem ([1]) which does not rely on random choices (a 
deterministic algorithm). The search of a deterministic algorithm with polynomial 
running time for Problem ([1} is still open, but a quasi-polynomial algorithm is 
known since [27]. 

This algorithm is actually a combination of two: on one hand, we consider the 
initial pair 

(5.5) 5-<; , c = (i,-..,i) 

Then, wc have: 

Theorem 18 (,27j)- The projective Newton-based homotopy method with initial 
pair has average running time polynomial in N and n'^, where d is the maxi- 

mum of the degrees. 

Theorem [TH] is a consequence of the following stronger result: 

Theorem 19 ([27j). The projective Newton-based homotopy method with initial 
pair ((7, C) G V has average running time polynomial in N and in max{/z((7, r?) : 

giv) = 0}. 

Theorem [T8l follows from Theorem [T9l and the fact that fJ-{g,ri) < 2(71 + 1)'^ for 
g given by (|5.5p for every zero rj of g. 

For small (say, bounded) values of d, the quantity n"^ is polynomial in n and 
thus polynomial in N, but for big values of d the quantity n'^ is not bounded 
by a polynomial in N, and thus Theorem [TH] does not claim the existence of a 
polynomial running time algorithm. However, it turns out that there is a previously 
known algorithm, based on the factorization of the w-resultant, that has exponential 
running time for small degrees, but polynomial running time for high degrees (this 
may seem contradictory, but it is not: when the degrees are very high, the input 
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size is big, and thus bounding the running time by a polynomial in the input size 
is sometimes possible in this case.) More precisely: 

Theorem 20 ( |471 I27|). There is an algorithm with average running time poly- 
nomial in N and V that, on input h G P('H(rf)) \ S, outputs an approximate zero 
associated to every single exact zero of h. 

Note that V is usually exponential in n, but as suggested above, if the degrees 
are very high compared to n, then V can be bounded above by a polynomial in the 
input size A'^ and thus the algorithm of Theorem [20] becomes a polynomial running 
time algorithm. 

An appropriate combination of theorems [18] and [20l using the homotopy method 
of Theorem [TH] for moderately low degrees and the symbolic-numeric method of 
Theorem [20] for moderately high degrees (or, simply, running both algorithms for 
every input and stopping whenever one of them finishes,) turns out to be quasipoly- 
nomial for every choice of n and (d) . Indeed: 

Theorem 21 ([27]). The average (for random /i G §j running time of the following 
procedure is 0{N^°^^°s'^): on every input h g ¥{T-L(^d)) \ ^> '''un simultaneously the 
algorithms of theorems I J^l and \2(j\ stopping the computation whenever one of the 
two algorithms gives an output. 

Note that the running time of this algorithm is thus quasi-polynomial in N . 
Moreover, the algorithm is deterministic because it does not involve random choices. 

5.3. Homotopy paths based in the evaluation at one point. Another ap- 
proach to construct homotopies was considered in [56] and generalized in [?] . Given 
h E 'H(d) and C, G P(C""''^), consider g = h — h^, where h^^ G T~l-{d) is defined as 

/^cW = ^^«.9(|^)mC). 

Then, g{Q = 0. So, we consider the homotopy /ij = (1 — t)g + th = h — {1 — t)h(^. 
We continue the zero C from /iq = g to hi = h. For any fixed C, for example 
C = Co = (1, 0, . . . , 0), the homotopy may be continued for almost all h G T-L(d)- Let 

K{h, C) = number of steps sufficient to continue C to a zero of h. 

Then, 

Theorem 22 ([!]). 

^ ^^'^'r(n + l)2"-l f ( ^ l^ih,V?r^ru ^\ -\\hW'/2^u 

where 

eih, ,) = I ^""'';J?^^V (TV2, n)e-V2 

T = \\Dtag{\\C\\-''^))hiC)\\, 
and r{a,n) ~ J^°° t"'^^e'~'^ dt is the incomplete gamma function. 
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In Theorem 1221 B(h,T]) is the basm of 77, which we now define. Suppose t] is 
a non-degenerate zero oi h E T-L^d)- We define the basin of jy, B{h,r]), as those 
C G P(C"+^) such that the zero ( of g = h — continues to 77 for the homotopy 
ht = {1 — t)g + th. We observe that the basins are open sets. 

Not much is known about E(_ftr). Might it be polynomial in A^? Is it even finite? 
See [3j for precise questions and motivations. Here is one: 

Problem 3. For h G 'H(d) \ o,'"'^ the volumes of all the basins of h equal? 



Let us know turn our sight back to (|4.ip . If we drop the condition number 
niht, Ct) from that formula, we get 



that is simply the length of the path [ht, Ct) in the solution variety V, taking on V 
the natural metric: the one inherited from that of the product P{H(^d)) x P(C"+^). 
The formula in (|4.ip can now be seen under a geometrical perspective: is just the 
length of the path (ht, Ct) when V is endowed with the conformal metric obtained 
by multiplying the natural one by the square of the condition number. Note that 
this new metric is only defined on W = V \ S'. We call this new metric the 
condition metric in W. This justifies the name condition length we have given 
to L^. Theorem [5] now reads simply as follows: the complexity of following a 
homotopy path (htXt) is at most a smah constant cd^/^ times the length of (htXt) 
in the condition metric. This makes the condition metric an interesting object of 
study: which are the theoretical properties of that metric? given p,q £ W, what is 
the condition length of the shortest path joining p and g? 

The first thing to point out is that is not a function, as it involves a matrix 
operator norm. However, ^ is locally Lipschitz. Thus, the condition metric is not a 
Riemannian metric (usually, one demands smoothness or at least for Riemann- 
ian metrics,) but rather we may call it a Lipschitz-Riemannian structure. This 
departs from the topic of most available books and papers dealing with geometry 
of manifolds, but there are still some things we can say. It is convenient to take a 
tour to a slightly more general kind of problems; that's the reason for the following 
section. 

6.1. Conformal Lipschitz Riemann structures and self convexity. Let M 

be a finite-dimensional Riemannian manifold, that is a smooth manifold with a 
smoothly varying inner product defined at the tangent space to each point a; G M, 
let us denote it (•, ■)x- Let a : M [0,oo) be a Lipschitz function that is there 
exists some constant K > such that 



where dR{x,y) is the Riemannian distance from x to y. Then, consider on each 
point a; G M the inner product (•, ■)a,x = a{x){-, ■)x- Note that this need no longer 
be smoothly varying with x, for a(x) is just Liptschitz. We call such a structure 
a (conformal) Lipschitz-Riemannian structure in M., and call it the a-structure. 



6. The condition Lipschitz-Riemannian structure 



L 




a{y)\<KdR{x,y), Va;,yGM, 
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The condition length of a path j{t) C M, a < t < b, is just 

La{l) = [''\h{t)\UMt)dt^ [\-/it)^^it))aMt)dt 

J a J a 

The distance between any to points p, g G M in this a-structure is defined as 
(6.1) da{p,q)^ inf ^0(7), p,qeM, 

7(t)CV 

where the infiinum is over all paths with 7(0) = p, 7(1) — q. 

A path 7(t), a < t < 6 is called a minimizing geodesic if La{j) — da{'y{a),j{b)) 
and ||7(t)||a,-^(t) = 1, that is if it minimizes the length of curves joining its extremal 
points and if it is parametrized by arc-length. Then, a curve ^{t) C M, for t in some 
(possibly unbounded) interval / is a geodesic if it is locally minimizing, namely if 
for every t in the interior of / there is some interval (a, h) (- I containing t and such 
that 7 \ \^a.b] is a minimizing geodesic. 

Each connected component of the set M with the metric given by da is a path 
metric space, and it is locally compact because M is a smooth finite-dimensional 
manifold. We arc in a position to use Gromov's version of the classical Hopf-Rinow 
theorem [321 Th.1.10], and we have: 

Theorem 23. Let M and a he as in the discussion above. Assume additionally 
that M is connected and that (M,da) is a complete metric space. Then: 

• each closed, bounded subset is compact, 

• each pair of points can be joined by a minimizing geodesic. 

Theorem [23] gives us sufficient conditions for conformal Lipschitz-Riemannian 
structures to be "well defined" in the sense that the infimum of (|6.ip becomes a 
minimum. We can go further: 

Theorem 24 ([ll]). In the notation above, any geodesic is of class C^+^v ^ that is 
it is and has a Lipschitz derivative. 

See [22] for an early version of Theorem [53] and for experiments related to this 
problem. 

One often thinks of the function a as some kind of "squared inverse of the 
distance to a bad set", so for each connected component of M the set (M, d^) will 
actually be complete. 

A natural property to ask about is the following: given p,q (z M, and given a 
geodesic j{t) such that 7(a) = p, 7(6) = q, does a attain its maximum on 7 in the 
extremes? Namely, if we think on a as some kind of squared inverse to a bad set, 
do we have to get closer to the bad set than what we are in the extremes? 

Example 25. A model to think of is Poincare half-plane with the metric given by 
the usual scalar product in M.'^ D {y > 0}, multiplied by y~^ . Geodesies then become 
just portions of vertical lines or half-circles with center at the axis y = 0. It is clear 
that, to join any two points, the geodesic does not need to become closer to the bad 
set {y = 0}. 

We can ask for more: we say that a is self-convex (an abbreviation for self-log- 
convex) if for any geodesic 7(i), the following is a convex function: 

t ^ log(a(7(t))). 
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Note that this condition is stronger than just asking for 1 1— > a{"f{t)) to be convex, 
and thus stronger than asking for the maximum of a on 7 to be at the extremal 
points. 

6.2. Convexity properties of the condition number. We have the following 
result: 

Theorem 26 ([10]). Let k > 1 and let N C E*^ 6e a submanifold without 
boundary of . LetU be the biggest open set all of whose points have 

a unique closest point in N. Then, the function a{x) ~ distance{x,N)~'^ is self- 
convex in U . 

Note that Theorem is a more general version of Example where the hori- 
zontal line {y = 0} is changed to a submanifold N. 

A well-known result usually attributed to Eckart and Young [35] and to Schmidt 
and Mirsky (see |60j ) relates the usual condition number of a full rank rectangular 
matrix to the inverse distance to the set of rank-deficient matrices: 

Theorem 27 (Condition Number Theorem of linear algebra). Let A e C"" a 
m X n matrix for some I < m < n. Let ai{A), . . . , am{A) be its singular values. 
Then, 

am{A) = distance(A, {rank-deficient matrices}). 

Ln particular, in the case of square maximal rank matrices, this we can rewrite 
this as ||^~^|| = distance{A, {rank-defficient matrices})'^ ^ , that is the (unsealed) 
condition number ||^~^|| equals the inverse of the distance from A to the set of 
singular matrices. We more generally call ct~^(A) the unsealed condition number 
of a (possibly rectangular) full-rank matrix A. 

One feels tempted to conclude from theorems [26] and [27] that the function send- 
ing a full-rank complex matrix A to the squared inverse of its smallest singular 
value (i.e. to the square of its unsealed condition number) should be self-convex. 
Indeed, one cannot apply Theorem [26] because the set of rank-deficient matrices 
is not a manifold, and because the distance to it is for many matrices (more 
precisely: whenever the multiplicity of the smallest singular value is greater than 
1) not attained in a single point. It takes a considerable effort to prove that the 
result is still true: 

Theorem 28 f[ll|). The function defined in the space of full-rank mx n matrices, 
I < m < n, as the squared inverse of the unsealed condition number, is self-convex. 

Note that this implies that, given any two complex matrices A, B of size mx n, 
and given any geodesic 7(t), a < t < 6 in the a-structure defined in 

C™" \ { rank-deficient matrices} 

by a{C) — (Jm{C)~'^ such that 7(a) = A, 7(6) = B, the maximum of a along 7 is 
a(^) or a{B). 

Note that, if a similar result could be stated for the a-structurc defined by 
{h, Q I— >■ ^{h, in W, we would have quite a nice description of how geodesies in 
the condition metric of W are. Proving this is still an open problem: 

Problem 4. Prove or disprove fj? is a self-convex function in W. 
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Note that from Theorem [71 the function is not exactly the squared inverse of 
the distance to a submanifold, but it is still something similar to that. This makes 
it plausible to believe that Problem 2] has an affirmative answer. A partial answer 
is known: 

Theorem 29 (tllj). The function h i-> eo) defined in the set {h e P('H(d)) : 
h(eo) = 0} is self-convex. Here, eo ~ (1, 0, . . . , 0). 



Although wc do not have an answer Problem IH we can actually state some 
bounds that give clues on the properties of the geodesies in the condition structure 
in W. More precisely: 

Theorem 30 (|17|). For every two pairs {hi, Ci), (/12, C2) G there exists a curve 
7t C W joining {hi,(^i) and (/i2,C2)j md such that 



c a universal constant. 

In the light of Theorem ^ this means that if one can find geodesies in the 
condition structure in W, one would be able to follow these paths in very few steps: 
just logarithmic in the condition number of the starting pair and the goal pair. 

Corollary 31. A sufficient number of projective Newton steps to follow some path 
in yV starting at the pair [g, Cq) of i5.^) to find an approximate zero associated to 
a solution C of a given system h G P('H(rf)) is 



c a universal constant. 

Note that only the logarithm of the condition number appears in Corollary [31] 
Thus, if one could find an easy way to describe condition geodesies in W, the 
average complexity of approximating them using Theorem [5] would involve just 
the expectation of the average of ln(/x) , not that of /i^ as in Theorem [T^l As a 
consequence, the average number of steps needed by such an algorithm would be 
0{nd^\TvN). See [THl Cor. 3] for a more detailed statement of this fact. At this 
point we ask a rather naive, vague question: 

Problem 5. May homotopy methods he useful in solving linear systems of equa- 
tions ? Might using geodesies help as in Corollary \31\ and the comments above ? 

Large sparse systems are frequently solved by iterative methods and the condition 
number plays a role in the error estimates. So Problem ([5|) has some plausibility. 

Remark 32. There is an exponential gap between the average number of steps 
needed by linear homotopy OijP^^nN) and those promised by the condition geodesic- 
based homotopy (which stays at a theoretical level by now, because one cannot eas- 
ily describe those geodesies) . This exponential gap occurs frequently in theoretical 
computer science. For example NP-complete problems are solvable in simply ex- 
ponential time but polynomial with a witness. The estimates for homotopies with 
condition geodesies may likely serve as a lower bound for what can be achieved. 



7. Condition geodesics and the geometry of W 
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Also, properties of geodesies as we learn them can inform the design of homotopy 
algorithms. 

There is more we can say about the geometry (and topology) of W, by studying 
the Frobenius condition number in W , which is defined as follows; 

JiihX) = \\h\\ \\Dh{C)^ Diag{\\C\\''^-'dy')\\F, V{hX) e W, 

where || ■ ||f is Frobenius norm (i.e. Trace{L* LY^^ where L* is the adjoint of L) 
and ^ is Moore-Penrose pseudoinverse. 

Remark 33. The Moore-Penrose pseudoinverse : F — > E o/ a linear operator 
L : E — > F of finite dimensional Hilbert spaces is defined as the composition 

where T^image{L) the orthogonal projection on image L, Ker{L)-^ is the or- 
thogonal complement of the nullspace of L, and is the inclusion. If A is a 
TO X (n + 1) matrix and A = UDV* is a singular value decomposition of A, 
D = Diag{ai, . . . , cr^, 0, . . . , 0) then we can write 

(7.2) A^ = VD^U\ Z)^ = L»m5(crf\...,a;:\0,...,0). 

In |19| we prove that is an equivariant Morse function defined in W with a 
unique orbit of minima given by the orbit B of the pair of (|5.2p under the action of 
the unitary group [U, {h, (,)) i-^ {h o U*,UC,). 

The function Ai{g, C) or even its upper bound (up to a \/2 factor) estimate 

Si(5,C)-^ / / fx{htXt)'dtdS 

is an average of /i^ in great circles. This remark motivates the following 

Problem 6. Is Ai{g, C) or Bi{g, C) also an equivariant Morse function whose only 
critical point set is a unique orbit of minima? 

If so, due to symmetry considerations, it is the orbit through the conjectured 
good starting point (|5.2p . Here, one may want to replace the condition number /i in 
the definition of Bi with a smooth version such as the Frobenius condition number. 
A positive solution to this problem solves our main problem: the conjectured good 
initial pair (|5.2p is not only good but even best. 

Because the Frobenius condition number is an equivariant Morse function, the 
homotopy groups of W are equal to those of B, that can be studied with standard 
tools from algebraic topology. In the case that n > 1, for example, we get: 

^o(W) = {0} 

7ri(W) = Z/aZ 

7r2(W) = Z 

^3(W) = 7rfe(5Zi„+i) (fc > 3), 

where SlAn+i is the set of special unitary matrices of size n + 1, a = gcd{n,di + 
■ ■ ■ + dn — 1) and Z/aZ is the finite cyclic group of a elements. 

In particular, we see that if all the djs are equal then a ~ 1 and W is simply 
connected; in particular, any curve can be continously deformed into a minimizing 



^see 15311 . 
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geodesic. See [19] for more results concerning the geometry of W. We can also 
prove a lower bound similar to the upper bound of Theorem 1301 

Theorem 34. let a : [a, 6] — > W 6e a curve. Then, its condition length is at 
least 



1 



^J.{a{a)) 



Remark 35. We have written Theorem \34\ using the condition metric as defined in 
this paper. The original result |19[ Prop. 11] was written for the so-called smooth 
condition length, obtained by changing fj, to Jl in the definition of the condition 
length. This change produces the ^/n + 1 factors in Theorem \34\ 

In his article |58j , Smale suggests that the input size of an instante of a numerical 
analysis problem should be augmented by log W{y) where W{y) is a weight function 
"... to be chosen with much thought..." and he suggests that " the weight is to 
resemble the reciprocal of the distance to the set of ill-posed problems." That is 
the case here. The condition numbers we have been using are comparable to the 
distance to the ill-posed problems and figure in the cost estimates. It would be 
good to develop a theory of computation which incorporates the distance to ill- 
posedness, or condition number and distance to ill-posedness in case they may not 
be comparable, (and precision in the case of round-off error) more systematically 
so that a weight function will not require additional thought. For the case of linear 
programming Renegar |48j accomplished this. It is our main motivating example 
as well as the work we have described on polynomial systems. The book [28] is the 
current state of the art. The geometry of the condition metric will to our mind 
intervene in the analysis. If floating point arithmetic is the model of arithmetic used 
then ill-posedness will include points where the output is zero as well as points where 
the output is not Lipschitz. 



8. The univariate case and elliptic Fekete points 

Let us now center our attention in the univariate case, that, once homogenized, 
turns out into the case of degree d homogeneous polynomials in two variables. One 
can then compute explicitly: 

f,ih,o-d'/-m\h\\\\{Dhio \^.)-'\\cr-'- 

If we are given a univariate polynomial /(x) and a complex zero z of /, we can also 
use the following more direct (and equivalent) formula for fi{h, () where h is the 
homogeneous counterpart of / and C = (1, z) : 

It was noted in j52j that the condition number is related to the classical problem 
of finding elliptic Fekete points, which we recall now in its computational form (see 
[3] for a survey on the state of art of this problem.) 

Given d different points xi, . . . ,Xd £ M^, let X — (xi, . . . , Xd) and 



£{X)=8{xu...,Xd)=-Y.\og\ 
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be its logarithmic potential. Sometimes £{X) is denoted by £o{X), £{0,X) or 
Vn{X). Let 5(1/2) be the Riemann sphere in M.^, that is the sphere of radius 1/2 
centered at (0, 0, 1/2), and let 

md = min £{xi,...,Xd) 

xi,....XieS{l/2) 

be the minimum value of £. A minimising d-tuplc X = (xi, . . . , Xd) is called a set 
of elliptic Fekete points 0. 

The computational problem of finding elliptic Fekete points is another of the 
problems in Smale's listQ. 

Smale's 7th problem [SH]: Can one find X = {xi, . . . , Xd) such that 

(8.1) £{X) ~ rud < clogd, c a universal constant. 



The first clue that this problem is hard comes from the fact that the value of 
is not known, even to 0{d). A general tcchniciue (valid for Riemannian manifolds) 
given by Elkies shows that 

d^ dlogd 
md>— — + 0{d). 

Wagner [63] used the stereographic projection and Hadamard's inequality to get 
another lower bound. His method was refined by Rakhmanov, Saff and Zhou [44j . 
who also proved an upper bound for md using partitions of the sphere. The lower 
bound was subsequently improved upon by Dubickas and Brauchart [33] , [53] . The 
following result summarizes the best known bounds: 

Theorem 36. Let Cd be defined^ by 

d^ dlogd ^ , 
md = — h Cdd. 



Then, 



-0.4375 < liminf Cd < limsupCd < -0.3700708... 



The relation of this problem to the condition number relies in the fact that sets of 
elliptic Fekete points are naturally "well separated" , and are thus good candidates 
to be the zeros of a "well-conditioned" polynomial, that is a polynomial all of whose 
zeros have a small condition number. In j52j Shub and Smale proved the following 
relation between the condition number and elliptic Fekete points. 

Theorem 37 ([53]). Let (i, . . . ,C,d G P(C^) be a set of projective points, and con- 
sider them as points in the Riemann sphere 5(1/2) with the usual identification 



) = 5(1/2). Let h be a degree d homogeneous polynomial such that its 



zeros 



are Ci, ••■ , Cd- Then, 

max{/i(/i,Ci) ■■l<i<d}< y^d{dTT) e^'^'^'--'^''^-"''' . 



^Such a d— tuple can also be defined as a set of d points in the sphere which maximize the 
product of their mutual distances. 

^Smale thinks on points in the unit sphere, but we may think on points in the Riemann sphere, 
as the two problems are equivalent by sending (a, b, c) 6 5(1/2) to 2(a, b, c) — (0, 0, 1). 

^The result in the original sources is written for the unit sphere, we translate it here to the 
Riemann sphere. 
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In particular, is xi, . . . ,Xd are a set of elliptic Fekete points, then 



max{/i(/i, C») : 1 < « < c?} < \/d{d+l). 

Remark 38. Let $Hc and 3m be, respectively, the real and complex part of a com- 
plex number. Here is alternative, equivalent definition for h and the Q. Instead 
of considering projective points in P(C^) we may just consider a set of complex 
numbers zi, . . . , Zd € C. Then, for 1 < i < d, we can define Q Cz S as 

/ as the polynomial whose zeros are zi, . . . , Zd, and h as the homogeneous counter- 
part of f. 

There exists no explicit known way of describing a sequence of polynomials 
satisfying niax{/i(/i, : h{(^) = 0} < d^, for any fixed constant c and d > 1. 
Theorem [37l implies that, if a c?-tuple satisfying (|8.ip can be described for any d, 
then such a sequence of polynomials can also be generated. From Theorem[T9l such 
h's are good starting points for the linear homotopy method, both for finding one 
root and for finding all roots. So, solving the elliptic Fekete points problem solves 
the starting point problem for n = 1. The reciprocal question is: does solving the 
starting point problem for n = 1 help with the Fckctc point problem? 

Problem 7. Suppose n ~ 1 and g ^ S minimizes X]^ g(^)=o /^(ff' 0^- Cii • • • i Cd 

(the zeros of g, seen as points in S{\/2)) solve Smale's 7-th problem? 

Wc have seen in Theorem [12] that the condition number of {h, (^) where h is 
chosen at random and ( is uniformly chosen at random among the zeros of h, grows 
polynomially in d. Then, Thcorcm l37l suggests that spherical points associated with 
zeros of random polynomials might produce small values of £. We can actually put 
some numbers to this idea. First, one can easily compute the average value of £ 
when xi, . . . ,Xd are chosen at random in S'(l/2), uniformly and independently with 
respect to the probability distribution induced by Lebesgue measure in iS'(l/2): 

By comparing this with Theorem 1361 we can see that random choices of points in 
the sphere already produce pretty low values of the minimal energy. One can prove 
that random polynomials actually produce points which behave better with respect 
to £: 

Theorem 39 ([3]). Let n = 1 and h Cz S be chosen at random w.r.t. the uniform 
distribution in §. Let Ci,---,C<i S{l/2) be the zeros of h. Then, the expected 
value of £((^i, . . . , (^d) equals 

d"^ d log d d 
T 4 4' 

By comparing this with Theorem 1361 we conclude that spherical points coming 
from zeros of random polynomials agree with the minimal value oi £, to order 
0{d). This result fits into a more general (yet, less precise) kind of result related 
to random sections on Riemann surfaces, see [64 1 [65 ] . 
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9. The algebraic eigenvalue problem 

The double fibration scheme proposed in p.2|) has been - at least partly - suc- 
cessfully used in other contexts. For example, in [T] a similar projection scheme 

Veig = {{A, A, v) e P(C"'+i) X P(C") : Av = Xv} 

(9.1) ^ 

P(H(rf)) P(C"+i) 

was used to study the complexity of a homotopy-based eigenvalue algorithm, ob- 
taining the following: 

Theorem 40. An homotopy algorithm can be designed that continues an eigenvalue- 
eigenvector pair (Ao, vq) of a n X n matrix Aq to one (Ai, vi) of another matrix Ai, 
the number of steps bounded above by 

C \\{A,X,v)\\fleig{A,X,v)dt, 

Jo 

c a universal constant. Here, ^eig is the condition number^ for the algebraic eigen- 
value problem , defined as 

(9.2) fle^g{A,X,v) = mixx{l,\\A\\F\\TT,±{XIn - A) |;i II}, 

where \\A\\f = trace{A* A)^/"^ is the Frobenius norm of A. 

Of course, we do not intend to summarize here the enormous amount of methods 
and papers dealing with the eigenvalue problem (see |60] for example). We just 
point out that there exists no proven polynomial-time algorithm for approximating 
eigenvalues (although different numerical methods achieve spectacular results in 
practice.) See l33| for some statistics about the QR (and Toda) algorithms for 
symmetric matrices. We don't know a good reference for the more difficult general 
case. Unshifted QR is not the fast algorithm of choice. The QR algorithm with 
Francis double shift executed on upper Hermitian matrices should be the gold 
standard. 

Problem 8. Does the QR algorithm with Francis double shift fail to attain con- 
vergence on an open subset of upper Hessenberg matrices? 

See [5] for open sets where Rayleigh quotient iteration fails, and IS] for a proof 
of convergence for normal matrices as well as a good introduction to the dynamics 
involved. 

Theorem [40] can probably be used in an analysis similar to that of Section [5] 
to complete a complexity analysis. Note that the integral in Theorem |40] is very 
similar in spirit to that of (|4.ip . This allows to introduce a condition metric in 
Veig. Some of the results in previous sections can be adapted to this new case. For 
example, an analogous of Theorem [501 holds (i.e. short geodesies exist,) see |2]. 

The eigenvalue problem and the problem of finding roots of a polynomial in 
one variable are, of course, connected. Given an n x n matrix A we may com- 
pute the characteristic polynomial of A, p(z) = det{zl — A) and then solve p{z). 
The zeros of p{z) are the eigenvalues of A. Trefethen and Bau [61] write "This 



quantity similar in spirit to the condition number fi for the polynomial system solving 
problem. 
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algorithm is not only backward unstable but unstable and should not be used" . 
Indeed when presented with a univariate polynomial p{z) to solve numerical linear 
algebra packages may convert the problem to an eigenvalue problem by consid- 
ering the companion matrix of p{z) and then solve the eigenvalue problem. If 
p{z) ^ z'^ + ad-iz'^^^ + • • • + flQ the compainon matrix is 

/O •••0 -flo \ 
1 •••0 -ai 
1 •••0 -aa 



: ■■■ ■■•0 -arf_2 

VO 1 -ad_i/ 

which is already in upper Hessenberg form. So conceivably Francis double shifted 
QR may fail to converge on an open set of companion matrices? 

Let us recall that the condition number of a polynomial and root is a property 
of the output map as a function of the input. So doesn't depend on the algorithms 
to solve the problem. This motivates the following 

Problem 9. What might explain the experience of numerical analysts, relating the 
polynomial solving methods versus that of eigenvalue solving? Might the condition 
number of the eigenvalue problem have small average over the set of nx n matrices 
with a given characteristic polynomial? 

Finally, if we consider the equation Av — Xv an equation in unknowns A and v, 
it is a set quadratic equation. So. we have n quadratic equations in n unknowns. 
By Bczout's theorem, after we homogenize, we expect 2" roots counted with mul- 
tiplicity. But there are only n eigenvalues. In [TJ [2] it is shown that the use of 
multihomogeneous Bezout theorem yields the correct zero count for this problem. 
Thus, a reasonable thing to do is to introduce a new variable a and consider the 
bilinear equation Aav = Xv which is bilinear in (a, A) and v. 

Problem 10 (see [32j). Prove an analogue of Theorem \4-0\ in the general multiho- 
mogeneous setting. 

Appendix A. A model of computation for machines with round-off 

AND INPUT ERRORS 

This section has been developed in discussions with Jean Pierre Dedieu and his 
colleagues Paola Boito and Guillaume Cheze. We thank Felipe Cucker for helpful 
comments. 

A.l. Introduction. During the second half of the 20th century, with the emer- 
gence of computers, algorithms have taken a spectacular place in mathematics, 
especially numerical algorithms (linear algebra, ode's, pde's, optimization), but 
also symbolic computation. In this context, complexity studies give a better un- 
derstanding of the intrinsic difficulty of a problem, and describe the performance 
of algorithms which solve such problems. One can associate the classical Turing 
model to symbolic computation based on integer arithmetic, and the BSS model 
to scientific computation on real numbers. However this ideal picture suffers from 
an important defect. Scientific computation does not use the exact arithmetic of 
real numbers but floating-point numbers and a finite precision arithmetic. Thus, a 



24 



CARLOS BELTRAN AND MICHAEL SHUB 



numerical algorithm designed on real numbers and the same algorithm running in 
finite precision arithmetic give a priori two different results. Any numerical analysis 
undergraduate book has at least one chapter dealing with the precision of numerical 
computations. See for example |61j or |38j . Yet, there is no solid approach to the 
definition and study of a model of computation including this aspect, as well as the 
role that conditioning of problems should play in the complexity estimates. 

Besides linear algebra problems and iterative processes, a key point to bear in 
mind is that we sometimes use floating point computers to answer decision (i.e 
Yes/No) problems, as is this matrix singular? or does this polynomial have a 
real zero?. The first attempts to use round-off machines to study decision problems 
are [30], and [29]. The authors consider questions like: under which conditions 
is the decision taken by a BSS machine the same as the decision taken by the 
corresponding round-off machine? Or, under which conditions is the decision taken 
by a round-off machine for a given input the same as the decision taken by the BSS 
machine on a nearby input? 

In these pages we point towards the development of a theory of finite precision 
computation via a description of round-off machines, size of an input, cost of a com- 
putation, single (resp. multiple) precision computations (a computation is "single 
precision" when a sufficient round-off unit 6 to reach relative precision u for any 
input X in the considered range is proportional to u), finite precision computability 
and finite precision decidability. These concepts have to be related to the intrinsic 
characteristics of the problem: its condition number (the local Lipschitz constant 
of the solution map), and its posedness (the distance to ill-posed problems). 

The model we propose is inspired by the BSS model but it stays close to real- 
life numerical computation. We prefer relative errors than absolute ones (this is 
the basis of the usual floating-point arithmetic.) We mention two papers of interest 
about the foundations of scientific computing, |26 [ I25 ) . with a point of view different 
than ours. 

A. 2. Round— off machine. A round-off machine is an implementation of a BSS- 
machine accounting for input error and round-off error of computations. These 
errors may mimic a particular floating point arithmetic but are designed to be 
more general. In particular, they are not tied down to a particular floating point 
model. Let M.°° be the disjoint union of the sets M", n > 0. For given x G we 
deflnc = max^ \xi\. A subset U C M.°° is open if it is the disjoint union of J7„ 
with Un C M" an open set. For this topology, a mapping / : R°° — !• M is continuous 
iff each restriction /n = / |r" is continuous. 

A BSS machine M is a directed graph with with several kinds of nodes including 
an input node, with input x € K°°, output nodes, computation nodes where rational 
functions are generally computed but here we restrict ourselves without loss of 
generality to the standard arithmetic operations, branching nodes (we branch on 
an inequality of the form y > 0.) A machine is a decision machine when the output 
is —1 or 1. The halting set H of M is the set of inputs giving rise to an output. 
We denote hy O : H M.°° the output map. There are a few technical concepts 
(mainly the input map I{x) and the computing endomorphism Hm) associated 
to M, the nonfamiliar reader may find formal definitions in [20l Chapter 2]. 

Given a BSS machine M defined over the real numbers as above and < 5 < 1, a 
round-off machine associated to M and i5 is another machine denoted (M, S) which 
we will call a BSS round-off machine. The nodes and state space of (M, S) are the 
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same as for M. The input map Is of {M,6) satisfies \Is{x)j — < S\I{x)j\ 

that is to say the relative error of the input is less than S for every coordinate j. 
The next node next state of (M, 6) at a computation node is similarly a relative 
approximation of the next node next state map of M and docs not necessarily 
represent an actual computation. The jth components of the next states satisfy 

\H(l\I^S)^state{x)j — HAI,state{x)j\ < S\HM,state{x)j\- 

Note that given M and J, there are many machines satisfying this definition. For 
example, M itself satisfies this definition for every S. The power of the definition is 
that certain claims will hold for every such a round-off machine, allowing us to use 
just the defining properties and not the particular structure of a given round-off 
machine. 



A. 3. Computability. In the sequel, we will only consider functions / : C R°° ^■ 
such that, for each n, the restriction /„ of / to 0„ = n R" takes its values 
in for an m depending only on n. 

Such a function is round-off computable when there exists a BSS machine M 
such that for any x € Q and any < e < 1, there exists a 6{x,e) such that any 
round-off machine {M,d{x,e)) outputs 0{x) with 

\d{x),-f{x),\<e\fix),l 

that is the output of {M,6{x,e)) is coordinatewise equal to f{x) up to relative 
precision e. 

Example 41. The function / : — > R, f{x,y) ~ xy (we can let it be zero in 
R°° \ R^J is round-off computable. Indeed, let x,y ^ and < e < 1. The output 
of a round-off machine (M, S) associated to the natural BSS machine for computing 
fix,y) is a number 

z = xyil + Si){l + S2){l + S3), 

for some 61,62, S3 bounded in absolute value by 6. It is useful to note the elementary 
inequality 



(A.l) V^n) "1-2"' V0<|m|<1. 

From this, we obviously have \z — xy\ < e\xy\ by taking 
(A.2) 6{{x,y),e)='-, 

The output of any round-off machine if x = or y = is clearly 0, and hence the 
same value for e of II A . suffices to satisfy the definition of computability. 

Example 42. The same argument proves that the function f : M°° — > R given by 
fix,,... round-off computable (say, we compute first X1X2 then 

X1X2X3 and so on) with 

(A.3) S{{x,y),e) = ^^, 

Example 43. A longer computation shows that the function f : {{x,y) e R^ : 
a; + y ^ 0} — > R, f{x,y) = x + y (again, we let it be zero in R°° \ R^J is also 
round-off computable. It suffices to take 

e 



Siix,y),e) = 



2 max ( 1 , 



x+y 



x+y 
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A more simple and still valid formula is 

Example 44. Let us now see that f{x) ^ xi + . . . + x„ is round-off computable 
in the set D, = {x € : Xi > Vi}. Indeed, let < e < 1 and let us consider 
the most simple BSS machine which computes first xi + X2, then adds X3 and so 
07(3 ^ round-off machine with precision S will produce, on input x — {xi, . . . ,Xn), 
a number 



x\ 



11(1 + ^1 



-X2 



\k=2 



\k—n— 1 



for some 6l bounded in absolute value by 6. This follows from the fact that, in 
addition to the input error on each coordinate, xi and X2 go through ri — 1 additions 
(which generate n + 1 errors), x^ goes though n — 2 additions and so on. Note that 

xi{l - Sr < XI (^{l + < + sr. 

Choosing 5 ~ at/{2n), < a < 1 O'^d using hA . _?)] we conclude that 



^1 11(1 + '^^'^) 



\k=l 



< aexi, 



and the same formula holds for X2, ■ ■ ■ ,Xn. The output of a round-off machine thus 
satisfies 



dix) =^x,il + ae^), 0<|e,|<e. 



That is, 



d{x)-Y,Xr 



^Xiaei < ^Xia\ei\ < ae^Xi, 

i—l i—1 i—1 

proving that f{x) is round-off computable in that set (just take a = 1). 

Example 45. Let us now see that f{x) = xi + . . . + a;„ is round-off computable in 
the set = {x S M°° : ^ x^ 7^ 0} . We consider the BSS machine that first adds all 
the nonnegative numbers, call a the result, then adds all the negative numbers, call 
b the result, and then computes a — b. Let < e < 1. We note that from Example 
\44\ by choosing S = ae/{2n) (some < a < Ij the round-off computation of the 
sum of positive (resp. negative) terms will be 

a = a{l + aei), b = 6(1 + 0:62), for some < |ei|, |e2| < e. 

From Example \4-3[ if we let 

\a + b\ 



that is if we let 



S{x,e)< 



|a + 6| e 
3V2Va^+b^2^' 



This is not the algorithm of choice in practical programming but is sufficient for our purposes 



here. 
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then 0{x) = J2i relative precision e. Using that + < nJ2 W^e can 

also use the formula 



(A.5) 6{x,e) 



Example 46. Combining examples \42\ and \45\ we see that the evaluation map of any 
multivariate polynomial p[xi, x„) is round-off computable in the complement 
of its zero set (just compute first the monomials and them add all the results). 

A. 4. Ill conditioned instances, condition number, posedness. Let us think 
of a function f : ft M°° —J- M°° as the solution map associated with some problem 
to be solved. The condition number associated with / and x measures the first- 
order (relative) componentwise or normwise variations of f{x) in terms of the first- 
order (relative) variations of x. 

First assume that / : J7 — >■ M, that is the function is real-valued. We say that 
X £ tt (the topological closure of ft) is well-conditioned when: 

• Either ||a;|| 7^ 0, and / can be extended to a Lipschitz function defined 
in a neighborhood of x in tt with |/(a;)| 7^ 0. In that case we define the 
componentwise condition number by 

Kf{x) = limsup_ ii'f/^^'i , 

• or / is constant in a neighborhood of x with \f{x)\ = 0. In this later case 
we define the condition number by Hf{x) — 0. 

Otherwise, we say that a; € f2 is ill-conditioned. The set of ill-conditioned instances 
is denoted by S/, and for a; G S/, we let Kf(x) = 00. 
For a general / : fi — >■ R°°, we define 

Kf{x) — sup K f. ix) (componentiseconditionnumber) 
j 

that is the condition number of / is the suprcmum of the condition numbers of 
its coordinates. Sometimes it is more useful to consider the normwise condition 
number, that we denote by the same letter as the context should make clear which 
one is used on each problem: 

Kf{x) = limsup (normwiseconditionnumber), 

x'i-^x.x'^Q — — 

We define the posedness of a problem instance x with 7^ as the distance 
to ill-posed problems: 

dix,^f) 

^ M ■ 

The relation between condition number and posedness is an important but unclear 
problem. We may expect a relation of the type 

TTfix) W Kfix)-'^ 

(condition number theorem) or at least inequalities like 

ClTTfixY' <Kf{x)-^ < C^TTfix)"' 
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for suitable positive constants Ci,pi (of. Lojasiewicz's inequality.) To get such 
a relation ill-posed problems should correspond to infinite condition numbers, but 
this is not always the case. Consider for example the decision problem: Is + j/^ < 
n? The problem is well conditioned except on the circle x^+y'^ — 11. but the distance 
to this circle determines the precision we need in the computation. 
Let Kf{x) — max{Kf{x),'Kf{x)~^). 

Example 47. For f{x) = xi ■ ■ ■ Xn defined in M.°° , it is easy to see that 



i^fi^) = v^i + ■■■ + ■ 



1 1 
— + ••• + — 



whenever Xi, . . . , Xn ^ 0- If Xi ^ for any i then Hf{x) = oo. 
On the other hand, 

min(|a;i|, . . . , |x„|) 

+ xl 

Thus, we have 



nf{x) <^xl + --- + xlj —^^-^--^ = V^Mx)-\ 



and 



V y mm(|xi|, . . . , \x„\y 

Namely, 

TTfixy^ < Kf{x) < y^TTf{x)-\ 

Example 48. For f{x) = xi + • • • + a;„ defined in H = {x e R°° '-^Xi^ 0}, 
have: 



For {x, y) IE il, a simple computation shows that 



• For X G dQ, that is '^Xi =0, we have Kf{x) = oo. 
Thus, we have 

^ dix,{x: Ex^ = 0}) ^ ^ n,ix,y)-\ 

Namely, 

(A.6) Kfix,y) 



A. 5. Single, multiple precision. Let / be a round-off computable function, and 
let Af be a BSS machine satisfying the definition of round-off computability above. 
This computation is single precision when for every < e < 1 there is a S — 6{e) 
such that any round-off machine (M, S) attains relative precision e for any input 
X € il, and such that 

CqC 



(A.7) S > 



Kf{xY^ dim(a;)'=3 
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for some positive constants cq, 02,03. This computation is multiple precision when 
a S exists such that 



(A.8) ,5 > 



for some Ci > 1. We say that the computation is strictly multiple precision when 
it is multiple precision but not single precision. 

Example 49. The inductive, naive algorithm for computing the round-off com- 
putable function f{xi, . . . , x„) = Xi - ■ ■ x„ defined in M.°° is single precision, from 
\A.3\) . The inductive, naive algorithm for computing the round-off computable func- 
tion f{x) = Xi + • • • + defined in {x G K°° : ^ 7^ 0} is single precision from 
m) and M) . 

A. 6. Size of an input. In many practical problems, we want to specify an output 
precision e. From our definition of round-off computable function, given x S 51 and 
< e < 1 a 5{x, e) will exist guaranteeing the desired precision, although it may 
be very hard to compute this 5 in some cases. Moreover, from (jA.Sp . the number 
Kf{x) will in general play a role in the value of 5{x,e) needed for any machine 
solving the problem. This dependence suggests that maybe the input should be 
considered as {x, e) and not just as x. These thoughts justify our definition of the 
size of an input, which includes a term related to e and another related to Kf(x): 

(A.9) ht{x) + I logel + log{K-f{x) + 1). 

where by ht{x) we mean 1 if a; = and 1 + ln{l + |/n(|a;|)|) ii x ^ 0. If x £ M" then 
by ht(x) we mean nmax/ii(xj) for j = 1 . . .n and Xj the components of x. 



A. 7. Cost of a computation. The cost of a computation on a round-off machine 
[M, S) which outputs y on input x is 

T{x,6) ■ (maxM(?/(*)) + |log5|) , 

where T(x, S) is the time for the computation to halt and 

arc the different vectors computed by (M, 5) on input x. 

We say that a function / : il — >■ R°° is polynomial cost computable if for every 
X G fl and < e < 1 there is exists 6{x, e) such that (M, S{x, e)) computes y which 
equals f{x) to relative error e with cost polynomially bounded by the input size 

The most important cases of polynomial cost computability will be in the cases 
where we restrict the space of functions to single (multiple) precision functions, 
for example in the case of single precision to the definition of polynomial cost we 
add the restriction that S{x,e) must satisfy (|A.7I) . These two possibilities (single 
or multiple precision) will give us two theories, both of which deserve to be worked 
out. 

Now that we have the notion of polynomial cost the classes P and NP may be 
defined and the problem: Does P = NP7 stated. 
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